NodeMind

Binary document intelligence. 32× smaller than float32 RAG. 48× smaller than HNSW. 96× with BGE-base PCA.

Real numbers, real data, reproducible. See benchmark below.


What is NodeMind?

NodeMind replaces float32 vector indexes with compact binary fingerprints. Instead of storing thousands of bytes per chunk (BGE-M3 float32 = 4,096 bytes), it stores 128 bytes. Retrieval uses Multi-Index Hashing (MIH) — pure integer arithmetic, no GPU, no external vector database.

Upload a PDF → get a 64 MB index instead of a 2 GB one. Query it on any CPU.
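Under the hood, ranking is one XOR and one popcount per byte of fingerprint. A minimal standalone sketch with random fingerprints (illustrative only, not NodeMind code):

import numpy as np

rng = np.random.default_rng(0)
a = rng.integers(0, 256, size=128, dtype=np.uint8)   # one 1024-bit fingerprint
b = rng.integers(0, 256, size=128, dtype=np.uint8)   # another

# Hamming distance = number of differing bits = popcount of the XOR
hamming = int(np.unpackbits(np.bitwise_xor(a, b)).sum())
print(hamming)   # 0..1024, smaller means more similar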

Live demo: nodemind.space


Real-World Benchmark

Tested on 500,000 chunks from a mixed real-world corpus: Wikipedia, arXiv papers, and Project Gutenberg books.
Embedded with BGE-M3 (1024-dim) on an NVIDIA A40 GPU.
Recall measured against exact cosine top-k ground truth on float32 embeddings.
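For reference, recall@k and MRR@k over such ground truth take only a few lines. A sketch of one common definition, assuming ground_truth[i] holds the exact cosine top-20 ids for query i and retrieved[i] holds the index's ranked ids; the benchmark's own scoring code may differ in detail:

def recall_at_k(ground_truth, retrieved, k):
    """Fraction of queries whose true nearest neighbour appears in the top k."""
    hits = sum(gt[0] in ret[:k] for gt, ret in zip(ground_truth, retrieved))
    return hits / len(ground_truth)

def mrr_at_k(ground_truth, retrieved, k=10):
    """Mean reciprocal rank of the true nearest neighbour within the top k."""
    total = 0.0
    for gt, ret in zip(ground_truth, retrieved):
        if gt[0] in ret[:k]:
            total += 1.0 / (ret[:k].index(gt[0]) + 1)
    return total / len(ground_truth)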

Retrieval Accuracy — BGE-M3 (1024-bit fingerprints)

| Metric    | NodeMind MIH |
|-----------|--------------|
| Recall@1  | 0.999        |
| Recall@3  | 0.999        |
| Recall@5  | 1.000        |
| Recall@10 | 1.000        |
| Recall@20 | 1.000        |
| MRR@10    | 0.9992       |

1,000 queries sampled from the same corpus. Ground truth = exact cosine top-20 on float32.

Retrieval Accuracy — BGE-base (768-bit and 256-bit fingerprints)

| Metric    | 768-bit | 256-bit (PCA) |
|-----------|---------|---------------|
| Recall@1  | 0.999   | 1.000         |
| Recall@5  | 1.000   | 1.000         |
| Recall@10 | 1.000   | 1.000         |
| MRR@10    | 0.9995  | 1.000         |

Same 500K corpus, same evaluation protocol.

Index Size — 500,000 chunks

| Index                               | Size     | Compression     |
|-------------------------------------|----------|-----------------|
| NodeMind BGE-M3 (1024-bit)          | 64 MB    | (reference)     |
| Float32 RAG (BGE-M3)                | 2,048 MB | 32× smaller     |
| HNSW index (float32 × 1.5 overhead) | 3,072 MB | 48× smaller     |
| NodeMind BGE-base (256-bit PCA)     | 16 MB    | 96× vs float32  |

Index sizes only; document text is stored separately, and identically, in all systems.

Dataset

| Source                     | Volume           | Description                      |
|----------------------------|------------------|----------------------------------|
| Wikipedia (Simple English) | ~100 MB raw text | General knowledge articles       |
| arXiv papers               | ~40 MB raw text  | Computer science & ML abstracts  |
| Project Gutenberg books    | ~28 MB raw text  | Public domain prose              |
| Total raw corpus           | ~168 MB          | 642,939 paragraphs               |
| Chunks                     | 500,000          | 400 words/chunk, 50-word overlap |
| Embedding model            | BGE-M3           | 1024-dim, unit-normalised float32|
| Hardware                   | NVIDIA A40 (46 GB) | 42.5 min to embed 500K chunks  |
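The chunking row above (400 words per chunk, 50-word overlap) corresponds to a standard sliding-window splitter. A minimal sketch of that scheme; the benchmark's exact splitter is not published, so treat chunk_words as illustrative:

def chunk_words(text, size=400, overlap=50):
    """Sliding window over whitespace-split words."""
    words = text.split()
    step = size - overlap                      # advance 350 words per chunk
    return [" ".join(words[i:i + size])
            for i in range(0, max(len(words) - overlap, 1), step)]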

Download — Verify It Yourself

All indexes were generated from the same 500,000 chunks. Download NodeMind + float32 RAG side by side to verify the compression ratios yourself.

| File                             | Size     | What it is                                                        |
|----------------------------------|----------|-------------------------------------------------------------------|
| NodeMind BGE-M3 Index (32×)      | 64 MB    | Binary fingerprints + index metadata                               |
| Float32 RAG Index (baseline)     | 2,048 MB | Raw float32 embeddings; verify the 32× yourself                    |
| HNSW Size Reference              | <1 KB    | Note only: HNSW ≈ float32 × 1.5 overhead, the basis of the 48× figure |
| NodeMind BGE-base 256-bit (96×)  | 16 MB    | PCA-compressed binary; verify the 96× yourself                     |
| Corpus                           | ~144 MB  | 500K text chunks (shared by all indexes)                           |
| Benchmark PDF                    | ~2 MB    | Full methodology and results report                                |

Full interactive benchmark page: nodemind.space/benchmark

Verify compression yourself

import pickle

# Load NodeMind index
with open("nm_bgem3_index.pkl", "rb") as f:
    nm = pickle.load(f)
# nm["fps"]  — (500000, 128) uint8  = 64 MB binary fingerprints
# nm["ctv"]  — (1024,) float32      = index metadata

# Load float32 RAG index
with open("rag_bgem3_index.pkl", "rb") as f:
    rag = pickle.load(f)
# rag["embeddings"] — (500000, 1024) float32 = 2,048 MB

nm_bytes  = nm["fps"].nbytes
rag_bytes = rag["embeddings"].nbytes
print(f"NodeMind : {nm_bytes  / 1e6:.0f} MB")
print(f"Float32  : {rag_bytes / 1e6:.0f} MB")
print(f"Ratio    : {rag_bytes // nm_bytes}×")   # → 32

# BGE-base 256-bit (96×)
with open("nm_bgebase256_index.pkl", "rb") as f:
    nm96 = pickle.load(f)
# nm96["fps"] — (500000, 32) uint8 = 16 MB
# float32 baseline for BGE-base = 500000 × 768 × 4 = 1,536 MB → 96×

Run a query

import pickle
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("BAAI/bge-m3")

with open("corpus.pkl", "rb") as f:
    corpus = pickle.load(f)
chunks = corpus["chunks"]   # list of 500,000 strings

# 256-entry lookup table: popcount of every possible byte value
POPCOUNT = np.array([bin(i).count("1") for i in range(256)], dtype=np.int32)
fps = nm["fps"]   # fingerprints loaded in the verification block above

def query_nodemind(text, top_k=5):
    emb = model.encode([text], normalize_embeddings=True)[0]
    # binarisation uses index metadata — details in the patent
    q_fp = _binarise(emb, nm)
    # Hamming distance: popcount of the XOR, summed over the 128 fingerprint bytes
    dists = POPCOUNT[np.bitwise_xor(fps, q_fp[np.newaxis, :])].sum(axis=1)
    top   = np.argsort(dists)[:top_k]
    return [(int(dists[i]), chunks[i][:120]) for i in top]

results = query_nodemind("What is quantum entanglement?")
for dist, text in results:
    print(f"  [{dist:4d}] {text}")

The _binarise function uses the index metadata stored in the .pkl file. The full binarisation method is covered by AU 2026901656; the index is self-contained and works without reading the patent.


How It Works

1. Embed

Text is chunked and embedded with a sentence embedding model (BGE-M3 or BGE-base), producing a high-dimensional float32 vector per chunk.

2. Binarise

Each embedding is converted to a compact binary fingerprint using the index's pre-computed metadata vector.
The result is 1024 bits (128 bytes) per chunk for BGE-M3, or 256 bits (32 bytes) with BGE-base PCA.
The binarisation is integer-only — no learned projection, no GPU needed at query time.
(The exact method is patent-protected — AU 2026901656.)
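For a feel of the bit-packing step only, here is the simplest generic baseline, sign-thresholding each dimension. This is explicitly not the patented CTV method, just an illustration of how 1024 floats become 128 bytes:

import numpy as np

def sign_binarise(emb):
    """Generic baseline (NOT NodeMind's method): 1 bit per dimension."""
    bits = (emb > 0).astype(np.uint8)   # 1024 dims -> 1024 bits
    return np.packbits(bits)            # -> (128,) uint8 = 128 bytes

emb = np.random.randn(1024).astype(np.float32)
print(sign_binarise(emb).nbytes)        # 128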

3. Index (MIH)

The binary fingerprints are stored in a Multi-Index Hash structure. At query time, candidates are found in matching hash buckets and re-ranked by full Hamming distance. Pure integer arithmetic, runs on any CPU.
(The MIH structure follows Norouzi et al. CVPR 2012. The novel contribution — CTV binarisation and portable single-file format — is covered under AU 2026901657.)
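A rough sketch of the MIH idea from Norouzi et al.: split each 1024-bit code into m substrings, index each substring in its own hash table, probe the tables with the query's substrings, and re-rank the candidate union by full Hamming distance. Real MIH also probes buckets within a small Hamming radius of each substring (that is what gives its recall guarantee); this sketch probes only exact-match buckets and is not NodeMind's implementation:

import numpy as np
from collections import defaultdict

M = 8                                   # 8 substrings of 128 bits (16 bytes) each

def build_mih(fps):                     # fps: (N, 128) uint8
    tables = [defaultdict(list) for _ in range(M)]
    for idx, fp in enumerate(fps):
        for t in range(M):
            tables[t][fp[t*16:(t+1)*16].tobytes()].append(idx)
    return tables

def mih_query(q_fp, fps, tables, top_k=5):
    # Candidates: every code that matches the query exactly in at least one substring
    cand = set()
    for t in range(M):
        cand.update(tables[t].get(q_fp[t*16:(t+1)*16].tobytes(), []))
    cand = np.fromiter(cand, dtype=np.int64) if cand else np.arange(len(fps))
    # Re-rank the (small) candidate set by full 1024-bit Hamming distance
    d = np.unpackbits(np.bitwise_xor(fps[cand], q_fp), axis=1).sum(axis=1)
    return cand[np.argsort(d)[:top_k]]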

4. Why It's Smaller

  • BGE-M3 float32 → binary: 32× vs float32, 48× vs HNSW
  • BGE-base + PCA to 256-bit → binary: 96× vs float32

The index is a single portable .pkl file. No server, no Docker, no external DB.
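The ratios fall straight out of per-chunk byte counts; a few lines of arithmetic to check them:

bge_m3_f32   = 1024 * 4           # 4,096 B per chunk, float32
bge_m3_hnsw  = bge_m3_f32 * 1.5   # ~6,144 B with HNSW graph overhead
nodemind_m3  = 1024 // 8          # 128 B (1024-bit fingerprint)
bge_base_f32 = 768 * 4            # 3,072 B per chunk, float32
nodemind_256 = 256 // 8           # 32 B (256-bit PCA fingerprint)

print(bge_m3_f32   / nodemind_m3)    # 32.0
print(bge_m3_hnsw  / nodemind_m3)    # 48.0
print(bge_base_f32 / nodemind_256)   # 96.0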


Honest Caveats

  • Self-retrieval benchmark. Queries are perturbed versions of corpus chunks — optimistic for binary methods. End-to-end QA accuracy on BEIR / MS MARCO has not yet been measured; results may differ on out-of-distribution queries.
  • HNSW comparison is index-size only. Real FAISS HNSW achieves recall@10 of 0.95–0.99 on most corpora. NodeMind achieves recall@10 of 1.000 on this benchmark, but this is a self-retrieval test — not a direct head-to-head on a neutral held-out set.
  • 96× requires BGE-base + PCA-256. If you need BGE-M3 (stronger cross-lingual model), you get 32×/48×. The 96× path uses a lighter model.
  • Corpus is text-only. Tables, code, and multi-modal documents were not tested.
  • Float32 RAG download is 2 GB. Budget the bandwidth if you want to verify baseline sizes.

Patents

  • AU 2026901656 — WHT Integer Codec (integer-only binarisation without learned projection)
  • AU 2026901657 — NodeMind Centroid MIH (CTV-based binary fingerprinting + MIH search)

Filed at IP Australia, May 2026. Built solo in Coleambally, regional NSW, Australia.

